Digital Imaging and Preservation - Oversize Color Illustrations
Author
Abstract
Digital imaging can play an important role in helping to preserve brittle library and archival materials, especially those characterized by color and oversize illustrations, neither of which can be satisfactorily preserved by analog methods. Used together, film intermediaries can provide long-term stable copies while digital versions can provide improved scholarly access. Setting benchmarks for high-quality color images is an important aspect of this approach, and the paper describes the results of a project carried out by Columbia University Libraries to use digital imaging in combination with single-frame color microfiche to preserve oversize brittle maps.

Analog and digital preservation: the hybrid approach

The point from which I come to digital imaging is as the preservation officer of a large academic research library hoping to employ digitization to copy brittle, damaged materials and make them more useful to scholars. Our collections contain millions of printed books, manuscripts, and archival materials. We are especially rich in collections that include graphic materials: architectural drawings, photographs, maps, and print collections in art history and other heavily illustrated disciplines. My job is to preserve the content of all of these media, but as I will discuss later, I have recently been concerned particularly with oversize illustrative materials.

The preservation ideal is to stabilize and repair the damaged item itself. Unfortunately, in the case of brittle paper items which exist in multiples (modern published books and archival records are prime examples) we must often settle instead for creating stable copies which are as permanent as we can make them.[1] These copies must have two characteristics. First, they must be long lasting; preservation microfilm, for instance, has a life expectancy of 500 years. Second, scholars must be able to use the copies for most if not all of the purposes for which they would have employed the originals. A long-lasting copy that cannot be used is not much improvement over a brittle original that crumbles as the page is turned. This holds as well for digital images. The potential for enhancing materials through digitization is almost unlimited, but if what is digitized is illegible, or so badly indexed that readers cannot find what they need, then we have provided neither preservation nor access.

The validity of digitization as a source of long-term preservation is very much an open question. Digital storage media have a short life relative to microfilm, and even relative to acid paper, which can last 50 to 100 years when carefully handled. In contrast, disks and tapes last twenty to thirty years at best. More troubling, software and hardware change with almost frightening speed and leave older iterations behind, their files all too often unreadable. We do not know whether digital files will change under repeated refreshment and migration through generations of software and hardware.[2] Especially when the original books or manuscripts will not survive long after scanning, we cannot afford to rely on a digital copy as our only record of what the item was. At least for the present, therefore, digital and analog preservation need to go hand in hand. For true long-term security, we must assure that we have the most durable analog version we can achieve: the original item itself, properly repaired, or a copy on film housed in permanent archival storage.
Complementing the analog version, we then create a "digital preservation version" by scanning at the highest resolution needed to produce full legibility, with grayscale or color (as appropriate) carefully matched to the original, and saving it in a lossless format. No enhancement is carried out, since the digital preservation version's role is to present an accurate record of the original for scholarly purposes. Certainly digitization offers tools to enhance images: a manuscript with text obscured by coffee stains can be digitally altered to be more legible, but this changes the facts of what the original really was at the time of scanning. Authenticity and accuracy in representing the original are particularly at issue if the original will be discarded after scanning. From the digital preservation version we can then derive as many copies as desired, and it is these use copies that can be enhanced and manipulated at will (see the sketch following the notes below). The point is to open up all the possibilities of use without compromising the authenticity of the digital preservation version, and to maintain the analog version for those who will need to consult it, and of course in case of accidental loss or change to the digital preservation version. Should it ever be needed or desired, we will have the analog version to re-scan.

This model, the "hybrid approach" as it has been called,[3] assumes that it is in fact possible to make a digital version of the original, that digitization is appropriate for preserving that item, and that the digital copy is good enough to serve the needs that justified selecting the item for preservation in the first place. Much of the current work in digital applications for preservation is aimed at determining whether the types of access which scholars require and desire can be provided by a digital version, and what level of quality is needed to capture at least as much information (preferably more) as the traditional methodologies.

Image quality issues

How do we decide how good is good enough? How can we set definitions for appropriate image quality? Capture of graphic materials is complex, encompassing continuous tone, halftone, or color illustrations, all mixed with black and white text. Not only do books include illustrations, but words and numbers are found within graphic media, for instance on architectural drawings or maps. In most of these cases legibility of the textual elements alone is far from sufficient to define an adequate copy.

Some experts advocate capturing all materials at the highest technically possible resolution and pixel depth (i.e., dynamic range) as the ideal preservation digitization goal, regardless of the nature of the originals, in order to assure that all potential uses will be met and that a second, better scan will not be needed in the future. But the higher the quality of capture, the larger the file, and the higher the cost of capture, retrieval and manipulation, and storage media. Funds are always limited, and going beyond the quality reasonably needed consumes money which could otherwise be employed in preserving further items.
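To see how quickly quality drives cost, consider the uncompressed size of a scan: it grows with the square of the resolution and linearly with the pixel depth. The short Python sketch below makes the comparison concrete; the page dimensions and the resolution and bit-depth combinations are illustrative assumptions, not figures from the Columbia project.

```python
# Back-of-the-envelope sizes for uncompressed raster scans.
# Page dimensions, resolutions, and bit depths below are illustrative
# assumptions, not figures taken from the paper.

def uncompressed_megabytes(width_in, height_in, dpi, bits_per_pixel):
    """Uncompressed image size in megabytes for a page of the given
    dimensions (inches) scanned at dpi dots per inch and the given pixel depth."""
    pixels = (width_in * dpi) * (height_in * dpi)
    return pixels * bits_per_pixel / 8 / 1_000_000

if __name__ == "__main__":
    pages = {"letter-size page (8.5 x 11 in)": (8.5, 11),
             "oversize map sheet (24 x 36 in)": (24, 36)}
    modes = {"binary (1 bit)": 1, "grayscale (8 bit)": 8, "color (24 bit)": 24}
    for page, (w, h) in pages.items():
        for mode, bits in modes.items():
            for dpi in (300, 600):
                size = uncompressed_megabytes(w, h, dpi, bits)
                print(f"{page:30s} {mode:18s} {dpi} dpi: {size:8.1f} MB")
```

At any given resolution the 24-bit color capture is twenty-four times the size of the binary capture, and under these assumptions an oversize sheet at 600 dpi in full color runs to roughly 900 megabytes uncompressed, which is the trade-off weighed in the discussion that follows.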
Rather than always aiming for the highest technically possible quality, it is essential to determine the upper limits of quality that will fully meet foreseeable needs, including different uses of the same image by different disciplines. A historian concentrating on dating a manuscript may well want any stains preserved for the historical evidence they convey. Another scholar may prefer to see what is underneath the stain. A geologist concerned with coding on a map needs legibility and distinguishable colors, while an art historian concerned with the aesthetics of the map needs color faithful to the original. All of these variations imply different requirements for the digitization of the original items.

The definition of a successful digital imaging project will at minimum require decisions on pixel depth: whether to capture information simply in black or white (binary), in grayscale (256 gradations captured using 8 bits per pixel), or in color (normally using 24 bits per pixel to give 16 million different potential colors). While a binary scan at 600 dots per inch is clearly superior to one at 300 dpi, a binary scan at 600 dpi is not so simply compared to a 300 dpi grayscale scan: the 8 bits of grayscale convey additional information not carried in the binary version, and color adds even more.[4] Since file size increases immensely for color, it is not feasible to simply scan everything at high resolution and 24-bit color, despite the potential for great accuracy of reproduction when color is properly handled.[5]

Substantial work has already been done on the level of quality required to capture the content of black and white printed text accurately. The Cornell University Preservation Department has been studying published 19th- and 20th-century volumes and has established that a 600 dpi binary scan will accurately and legibly capture black print on a light background down to 4-point type, the size of the smallest picture captions and footnotes (where the lower-case letter e is approximately 1 mm high).[6] The Yale University Preservation Department's Project Open Book investigates the level of quality required to capture black-and-white text from preservation-quality microfilm. Again a 600 dpi binary scan appears to be the highest resolution needed for accurate capture.[7]

Based on this work, Cornell is testing a series of benchmarks to predict what resolution is needed for materials of different sorts, based on the size of the smallest letter which must be legible.[8] They have devised formulae for binary and grayscale scanning to determine how many dots per inch are required to produce resolution equivalent to that of microfilm at quality index level 8. Quality index (QI) is a measurement of the degree of resolution provided on microfilm, and national standards require that the third-generation microfilm (the service copy, the copy people actually read) must have a QI of 8.[9] Cornell's benchmarks should help assure an equivalent level of quality for digitized text, as well as allowing quick ...
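Expressed as a formula, the binary-scanning benchmark is commonly stated as QI = (0.039 × h × dpi) / 3, where h is the height in millimeters of the smallest lowercase e that must be legible and 0.039 converts millimeters to inches; solving for dpi at QI 8 with a 1 mm e gives roughly 615 dpi, consistent with the 600 dpi figure cited above. The Python sketch below implements that relationship; the exact constants, and especially the grayscale variant, are assumptions that should be checked against the Kenney and Chapman report cited in note 8.

```python
# Quality-index (QI) benchmark sketch for the relationship described above.
# Binary: QI = (0.039 * h_mm * dpi) / 3, where h_mm is the height in millimeters
# of the smallest lowercase "e" that must be legible and 0.039 converts mm to inches.
# The grayscale constant of 2 is an assumption about the published variant;
# verify both constants against Kenney and Chapman (note 8) before relying on them.

MM_TO_INCH = 0.039  # approximate conversion used in the benchmark formulae

def required_dpi(quality_index, smallest_e_mm, mode="binary"):
    """Dots per inch needed to reach a given QI for the smallest character height."""
    constant = 3.0 if mode == "binary" else 2.0
    return constant * quality_index / (MM_TO_INCH * smallest_e_mm)

def achieved_qi(dpi, smallest_e_mm, mode="binary"):
    """QI delivered by a scan at the given resolution for the same character height."""
    constant = 3.0 if mode == "binary" else 2.0
    return MM_TO_INCH * smallest_e_mm * dpi / constant

if __name__ == "__main__":
    # A 1 mm lowercase "e" (roughly 4-point type) at the QI 8 service-copy standard:
    print(f"binary dpi needed for QI 8: {required_dpi(8, 1.0):.0f}")    # about 615
    print(f"QI of a 600 dpi binary scan: {achieved_qi(600, 1.0):.1f}")  # about 7.8
```

Run as written, this reproduces the roughly 600 dpi requirement for 1 mm type; doubling the character height halves the resolution needed for the same quality index.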
Notes

1. Permanence is of course a relative concept. See for instance James O'Toole, "On the Idea of Permanence," American Archivist, 1989, pp. 10-25.
2. Cf. the recent discussion by Jeff Rothenberg, "Ensuring the Longevity of Digital Documents," Scientific American, 1995, pp. 42-47.
3. Don Willis, A Hybrid Systems Approach to Preservation of Printed Materials, Washington, DC: Commission on Preservation and Access, 1992.
4. Cf. the discussion in Michael Ester, "Image Quality and Viewer Perception," Visual Resources, 1991, pp. 51-63.
5. A chart of file sizes for binary, grayscale, and color images is provided in Peter Robinson, The Digitization of Primary Textual Sources, Oxford: Oxford University Office for Humanities Communication, 1993, pp. 11-12.
6. Anne Kenney, "Digital-to-Microfilm Conversion: An Interim Preservation Solution," Library Resources & Technical Services, 1993, pp. 380-402; and 1994, pp. 87-95, especially pp. 88-89.
7. Paul Conway and Shari Weaver, The Setup Phase of Project Open Book, Washington, DC: Commission on Preservation and Access, 1994.
8. Anne Kenney and Stephen Chapman, Digital Resolution Requirements for Replacing Text-Based Material: Methods for Benchmarking Image Quality, Washington, DC: Commission on Preservation and Access, 1995.
9. RLG Preservation Microfilming Handbook, Mountain View, CA: Research Libraries Group, 1992, p. 41.
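Returning to the hybrid-approach workflow described earlier: the digital preservation version is kept lossless and unenhanced, and all manipulation happens on use copies derived from it. The sketch below is a minimal illustration of that separation, assuming the Python Pillow imaging library and hypothetical filenames; it is not the procedure used in the Columbia project.

```python
# Minimal master/derivative separation for the hybrid approach: the lossless
# preservation master is only ever read, and enhancement, resizing, and lossy
# compression are applied only to derived use copies. Filenames and settings
# are hypothetical.
from PIL import Image, ImageEnhance

MASTER = "map_0001_master.tif"  # lossless digital preservation version (never modified)

def make_use_copy(master_path, out_path, max_pixels=3000, contrast=1.2):
    """Derive an enhanced, reduced-size access copy from the preservation master."""
    with Image.open(master_path) as master:
        use_copy = master.convert("RGB")  # work on an in-memory copy of the pixel data
    use_copy = ImageEnhance.Contrast(use_copy).enhance(contrast)  # enhancement allowed here
    use_copy.thumbnail((max_pixels, max_pixels))  # shrink for delivery
    use_copy.save(out_path, "JPEG", quality=85)   # lossy output is acceptable for use copies

if __name__ == "__main__":
    make_use_copy(MASTER, "map_0001_access.jpg")
```

The design point is simply that the master file is opened read-only and every alteration lands on a derived object, so the authenticity of the digital preservation version is never at risk.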
Similar resources
Recognition of Sequence of Print and Ink Strokes: Investigation the Effect of Handwriting Pressure, Hue of Ink, Printer and Paper Type
With the introduction of digital techniques, forensic document examiners have been encouraged to work with better accuracy in non-destructive ways. The aim of this study was to present a non-destructive, accessible, economic (affordable), user-friendly, portable, useful and easy technique for specifying the order of crossing lines of ink stroke and printed text. The intersections of LaserJet and In...
Spectral Estimation of Printed Colors Using a Scanner, Conventional Color Filters and applying backpropagation Neural Network
Reconstructing the spectral data of color samples using conventional color devices such as a digital camera or scanner is always of interest. Nowadays, multispectral imaging has introduced a feasible method to estimate the spectral reflectance of the images utilizing more than three-channel imaging. The goal of this study is to spectrally characterize a color scanner using a set of conventional...
Meg Bellinger, Digital Imaging: Issues for Preservation and Access
This discussion outlines some of the issues that must be considered before digital imaging of paper-based research material should be adopted as a preservation method. In addition, the quality of the digital image in terms of resolution and pixel depth, as well as issues of authenticity, verification, and bibliographic integrity will be discussed. In this context, issues associated with preserv...
Color scene transform between images using Rosenfeld-Kak histogram matching method
In digital color imaging, it is of interest to transform the color scene of one image to another. Some attempts have been made in this case using, for example, lαβ color space, principal component analysis and recently the histogram rescaling method. In this research, a novel method is proposed based on the Rosenfeld and Kak histogram matching algorithm. It is suggested that to transform the color...
Review of Suppression of Artifacts for Example-Based Video Color Transfer
In today’s technological era human beings are mostly interested in quality and purity of photographs or images. With the digital camera’s invention, capturing images has become extremely easy and widespread, while image data has become more robust and easy to manipulate. Among the many possible image-processing options, users have become increasingly interested in changing an image’s tone or mo...